class: center, middle, inverse, title-slide .title[ # Three common mistakes in statistics and how to avoid them ] .author[ ### Elizabeth Pankratz ] .institute[ ### Department of Psychology
The University of Edinburgh ] --- class: middle ## But first: [mentimeter link] [mentimeter qr code] --- .pull-left[ ## The mistake ] .pull-right[ ## How you'll avoid it ] <!-- -- --> .pull-left[
**A common R programming mistake:** Letting R treat all variables that consist of numbers as numeric. ] .pull-right[ ] <!-- -- --> .pull-left[
**An advanced statistical mistake:** Modelling categorical, ordinal data as if it were numeric. ] .pull-right[ ] <!-- -- --> .pull-left[
**A foundational statistical mistake:** Interpreting a significant *p*-value as proof that an effect exists. ] .pull-right[ ] --- class: inverse, middle, center
# The data we'll use --- ## The SMARVUS dataset (Terry et al., 2023) .center[SMARVUS = **S**tatistics and **M**athematics **A**nxieties and **R**elated **V**ariables in **U**niversity **S**tudents] .pull-left[ A survey of *n* = 18,841 students (mostly Psychology UGs) from 35 countries. Students rated their anxiety from 1 (no anxiety) to 5 (a great deal of anxiety) in scenarios like: - Studying for a statistics test. - Interpreting the meaning of a table in a journal article. - **Going to ask my statistics teacher for individual help with material I am having difficulty understanding.** ] .pull-right[ <img src="data:image/png;base64,#demo_files/figure-html/bar-aggregated-1.png" width="504" style="display: block; margin: auto;" /> ] --- ## Why Likert scale ratings are not numeric .center[  ] --- ## And yet... .center[  ] Figure 2 from Reeder et al. (2017), published in Journal of Memory and Language! --- ## By default, R will keep numeric-looking variables numeric If we allow R's default behaviour, then we can do naughty things with categorical variables: ``` r mean(anx$rating) ``` ``` ## [1] 2.868054 ``` But if we store these variables as factors, the naughty things become impossible (yay!): ``` r anx <- anx |> mutate(rating = factor(rating)) mean(anx$rating) ``` ``` ## [1] NA ``` --- # LOs --- class: inverse, middle, center
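# One more data step

---

## Ordered factors record the scale's ordering

A quick aside (using a toy vector standing in for the `rating` column): plain `factor()` blocks arithmetic, but an *ordered* factor additionally tells R that the levels 1–5 have a meaningful order — which is exactly the structure an ordinal model uses.

``` r
# Hypothetical mini-version of the rating column
rating <- factor(c(2, 5, 1, 3, 3), levels = 1:5, ordered = TRUE)

# The ordering is preserved, so comparisons still make sense...
rating[1] < rating[2]   # 2 < 5, so TRUE

# ...but arithmetic, which assumes equal spacing, stays blocked
mean(rating)            # NA (with a warning)
```

---
class: inverse, middle, center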
# Modelling an ordinal variable ### The .mono-white[polr()] express --- ## Fit an ordinal regression model ``` r library(MASS) # MASS contains the polr() function # (polr = Proportional Odds Logistic Regression) anx_fit1 <- polr( rating ~ 1, # intercept-only model, to start data = anx, Hess = TRUE, # required if we want to use summary() method = 'probit' # more on this in a moment ) ``` --- ## Fit an ordinal regression model ``` r summary(anx_fit1) ``` ``` ## Call: ## polr(formula = rating ~ 1, data = anx, Hess = TRUE, method = "probit") ## ## No coefficients ## ## Intercepts: ## Value Std. Error t value ## 1|2 -0.8420 0.0157 -53.7268 ## 2|3 -0.1678 0.0138 -12.1462 ## 3|4 0.3833 0.0141 27.1512 ## 4|5 1.0339 0.0168 61.6193 ## ## Residual Deviance: 26596.28 ## AIC: 26604.28 ``` --- ## What do those `Intercepts` mean? <img src="data:image/png;base64,#demo_files/figure-html/plot-underlying-normal-1.png" width="864" style="display: block; margin: auto;" /> ??? - imagine that there's some underlying continuous normal distribution of anxiety, assumed standard normal [show normal distrib] - ppl with high anxiety are more likely to give high responses, ppl with low anxiety more likely to give low responses (could do emojis relating to anxiety:
) - so to estimate how different anxiety levels translate to different responses on the 1--5 scale, we draw thresholds on that distribution [add thresholds] - ppl with anxiety in this bin will respond with 1, in this bin with 2, etc. - and those thresholds, the cutpoints btwn ratings, are the intercepts. - [show intercept estimates, put those same numbers on the thresholds] - normal distribution assumption is from method = probit. other methods assume other underlying distributions, but the idea of thresholds is the same. --- class: inverse, middle, center
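# From thresholds to probabilities

---

## Sanity check: what the thresholds imply

With `method = 'probit'`, the latent scale is standard normal, so each rating's probability is the area between adjacent thresholds. A quick sketch, with the cutpoints copied from the `anx_fit1` summary:

``` r
# Cutpoints 1|2, 2|3, 3|4, 4|5 from the intercept-only model
cuts <- c(-0.8420, -0.1678, 0.3833, 1.0339)

# P(rating = k) = area under the standard normal between cutpoints
probs <- diff(c(0, pnorm(cuts), 1))
names(probs) <- 1:5
round(probs, 3)
```

As a check, the implied mean rating, `sum(1:5 * probs)`, comes out very near the 2.87 we computed (naughtily) from the raw numbers earlier.

---
class: inverse, middle, center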
# "Preregistration" activity ### The effect of gender on stats anxiety --- #### How will a student's gender affect ratings for "Going to ask my statistics teacher for individual help with material I am having difficulty understanding"? <img src="data:image/png;base64,#demo_files/figure-html/plot-gender-bars-1.png" width="792" style="display: block; margin: auto;" /> --- Activity is part of the handout.
**For one minute:** Think it through and write down some predictions for yourself.
**For one minute:** Ask your neighbour what they predicted. What was their reasoning? What was yours?
**Afterward we'll look at the model's estimates together.** ??? The puzzle pieces: - The threshold values will stay in the same place. - If you move the normal distribution left, then more probability density will end up toward the lower end of the scale (less anxiety). If you move the distribution right, more probability density will be toward the higher end of the scale (more anxiety). - Each gender gets one probability distribution over anxiety values. - Here's the one for Female/Woman. - I want you to "preregister" a guess, before we look at the outcome of the model, whether the distribution for the Male/Men group will be shifted left or right from the Female one, and whether the distribution for the Another Gender group will be shifted left or right from the Female one. Based on this data. - Draw it on the plot on the handout, if you want. --- ## How does gender affect anxiety ratings? .center[ <img src="data:image/png;base64,#demo_files/figure-html/all-gender-normals-1.png" width="864" style="display: block; margin: auto;" /> ] --- ``` r anx_fit2 <- polr( rating ~ gender, data = anx, method = 'probit', Hess = TRUE ) summary(anx_fit2) ``` ``` ## Call: ## polr(formula = rating ~ gender, data = anx, Hess = TRUE, method = "probit") ## ## Coefficients: ## Value Std. Error t value ## genderMale/Man -0.3280 0.03015 -10.880 ## genderAnother Gender 0.4846 0.11992 4.041 ## ## Intercepts: ## Value Std. Error t value ## 1|2 -0.9045 0.0169 -53.5402 ## 2|3 -0.2246 0.0150 -14.9847 ## 3|4 0.3318 0.0151 21.9158 ## 4|5 0.9889 0.0176 56.2958 ## ## Residual Deviance: 26456.99 ## AIC: 26468.99 ``` --- # LOs --- class: inverse, middle, center
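# Reading the coefficients

---

## From coefficients to group predictions

A sketch of how the probit coefficients translate into per-group rating probabilities, with the thresholds and Male/Man coefficient copied from the `anx_fit2` summary. In `polr()`'s parameterisation, P(rating ≤ k) = `pnorm(threshold_k - coefficient)`, so a negative coefficient shifts probability toward the low-anxiety end of the scale:

``` r
cuts      <- c(-0.9045, -0.2246, 0.3318, 0.9889)  # thresholds from anx_fit2
beta_male <- -0.3280                              # genderMale/Man

p_female <- diff(c(0, pnorm(cuts), 1))             # reference level
p_male   <- diff(c(0, pnorm(cuts - beta_male), 1))

rbind(Female = round(p_female, 3),
      Male   = round(p_male, 3))
```

The fitted model produces the same probabilities directly via `predict(anx_fit2, type = "probs")`.

---
class: inverse, middle, center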
# Interpreting *p*-values --- ## Are the effects of `gender` significant? ``` Coefficients: Value Std. Error t value genderMale/Man -0.3280 0.03015 -10.880 genderAnother Gender 0.4846 0.11992 4.041 ``` No *p*-values in the model summary. But it's common practice to compare `t value` to a standard normal distribution (treating them like *z* scores). ``` r pnorm(abs(-10.880), lower.tail = FALSE) * 2 ``` ``` ## [1] 1.43563e-27 ``` ``` r pnorm(abs( 4.041), lower.tail = FALSE) * 2 ``` ``` ## [1] 5.322376e-05 ``` Since both *p*-values are below 0.05: - we CAN reject the null hypothesis that gender has no effect on ratings. - we CANNOT conclude that there really is an effect of gender. --- #### Why doesn't a significant *p*-value mean that there really is an effect? Because we can also get significant *p*-values when there really is *no* effect. .pull-left[ No difference in reality: <img src="data:image/png;base64,#demo_files/figure-html/true-skew-probdist-1.png" width="504" style="display: block; margin: auto;" /> ] .pull-right[ A possible random sample (*n* = 50 per group): <img src="data:image/png;base64,#demo_files/figure-html/simdat-1.png" width="504" style="display: block; margin: auto;" /> ] --- ``` r sim_fit <- polr(rating ~ group, data = simdat, method = 'probit', Hess = TRUE) summary(sim_fit) ``` ``` ## Call: ## polr(formula = rating ~ group, data = simdat, Hess = TRUE, method = "probit") ## ## Coefficients: ## Value Std. Error t value ## groupGroup B -0.4479 0.2229 -2.009 ## ## Intercepts: ## Value Std. Error t value ## 1|2 -0.3244 0.1677 -1.9344 ## 2|3 0.1738 0.1664 1.0447 ## 3|4 0.9263 0.1879 4.9283 ## 4|5 1.4570 0.2333 6.2444 ## ## Residual Deviance: 267.4865 ## AIC: 277.4865 ``` ``` r pnorm(abs(-2.009), lower.tail = FALSE) * 2 ``` ``` ## [1] 0.04453713 ``` --- .pull-left[ ## The mistake ] .pull-right[ ## How you'll avoid it ] <!-- --> .pull-left[
**A common R programming mistake:** Letting R treat all variables that consist of numbers as numeric. ] .pull-right[ When you know a variable is categorical, tell R that using `factor()`. ] <!-- --> .pull-left[
**An advanced statistical mistake:** Modelling categorical, ordinal data as if it were numeric. ] .pull-right[ Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`). ] <!-- --> .pull-left[
**A foundational statistical mistake:** Interpreting a significant *p*-value as proof that an effect exists. ] .pull-right[ Understand that significant *p*-values can arise even if no effect exists. ] --- ## TODO References - Reeder et al. (2017). *Journal of Memory and Language*. - Terry et al. (2023). The SMARVUS dataset.
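---

## Bonus: simulating false positives

A minimal simulation sketch (not part of the analyses above; it swaps the deck's `polr()` model for a quick rank-based `wilcox.test()`): both groups are drawn from the *same* rating distribution, so every significant result is a false positive — and over many repeats, roughly 5% of *p*-values still fall below 0.05.

``` r
set.seed(1)

p_true <- c(0.20, 0.23, 0.22, 0.20, 0.15)  # one shared rating distribution

pvals <- replicate(2000, {
  a <- sample(1:5, 50, replace = TRUE, prob = p_true)
  b <- sample(1:5, 50, replace = TRUE, prob = p_true)
  suppressWarnings(wilcox.test(a, b)$p.value)
})

mean(pvals < 0.05)   # close to the nominal 0.05
```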